1,334 research outputs found
PAC Classification based on PAC Estimates of Label Class Distributions
A standard approach in pattern classification is to estimate the
distributions of the label classes, and then to apply the Bayes classifier to
the estimates of the distributions in order to classify unlabeled examples. As
one might expect, the better our estimates of the label class distributions,
the better the resulting classifier will be. In this paper we make this
observation precise by identifying risk bounds of a classifier in terms of the
quality of the estimates of the label class distributions. We show how PAC
learnability relates to estimates of the distributions that have a PAC
guarantee on their distance from the true distribution, and we bound the
increase in negative log likelihood risk in terms of PAC bounds on the
KL-divergence. We give an inefficient but general-purpose smoothing method for
converting an estimated distribution that is good under the metric into a
distribution that is good under the KL-divergence. Comment: 14 pages
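As an illustration of the ideas in this abstract (not the paper's own construction), the following minimal sketch applies the Bayes classifier to estimated class distributions over a finite domain. It also shows why smoothing matters for the KL-divergence: an estimate that assigns zero probability to a possible outcome has infinite KL-divergence from the truth. The function names, the uniform-mixing smoother, and the parameter `gamma` are all assumptions made for this example.

```python
import math

def bayes_classify(x, priors, class_dists):
    """Return the label maximizing prior * estimated likelihood of x."""
    return max(priors, key=lambda label: priors[label] * class_dists[label].get(x, 0.0))

def kl_divergence(p, q):
    """KL(p || q) on a finite domain; infinite if q misses mass that p has."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf
        total += px * math.log(px / qx)
    return total

def smooth(q, domain, gamma):
    """Mix q with the uniform distribution on domain (an illustrative choice)."""
    u = 1.0 / len(domain)
    return {x: (1.0 - gamma) * q.get(x, 0.0) + gamma * u for x in domain}

# Hypothetical data: the estimate is close to the truth but gives "c" zero mass.
domain = ["a", "b", "c"]
true_p = {"a": 0.5, "b": 0.4, "c": 0.1}
est_q = {"a": 0.55, "b": 0.45, "c": 0.0}
smoothed_q = smooth(est_q, domain, gamma=0.1)

# Plug-in Bayes classifier built from estimated class distributions.
priors = {"pos": 0.5, "neg": 0.5}
class_dists = {"pos": {"a": 0.8, "b": 0.2}, "neg": {"a": 0.3, "b": 0.7}}
```

Here `kl_divergence(true_p, est_q)` is infinite, while the smoothed estimate has finite KL-divergence, at the cost of moving slightly away from the original estimate.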
The Complexity of the Homotopy Method, Equilibrium Selection, and Lemke-Howson Solutions
We show that the widely used homotopy method for solving fixpoint problems,
as well as the Harsanyi-Selten equilibrium selection process for games, are
PSPACE-complete to implement. Extending our result for the Harsanyi-Selten
process, we show that several other homotopy-based algorithms for finding
equilibria of games are also PSPACE-complete to implement. A further
application of our techniques yields the result that it is PSPACE-complete to
compute any of the equilibria that could be found via the classical
Lemke-Howson algorithm, a complexity-theoretic strengthening of the result in
[Savani and von Stengel]. These results show that our techniques can be widely
applied and suggest that the PSPACE-completeness of implementing homotopy
methods is a general principle. Comment: 23 pages, 1 figure; to appear in FOCS 2011 conference
The exact sample complexity of PAC-learning problems with unit VC dimension
The Vapnik-Chervonenkis (VC) dimension is a combinatorial measure of a certain class of machine learning problems, which may be used to obtain upper and lower bounds on the number of training examples needed to learn to prescribed levels of accuracy. Most of the known bounds apply to the Probably Approximately Correct (PAC) framework, which is the framework within which we work in this paper. For a learning problem with some known VC dimension, much is known about the order of growth of the sample-size requirement of the problem, as a function of the PAC parameters. The exact value of the sample-size requirement is, however, less well known, and depends heavily on the particular learning algorithm being used. This is a major obstacle to the practical application of the VC dimension. Hence it is important to know exactly how the sample-size requirement depends on VC dimension, and with that in mind, we describe a general algorithm for learning problems having VC dimension 1. Its sample-size requirement is minimal (as a function of the PAC parameters), and turns out to be the same for all non-trivial learning problems having VC dimension 1. While the method used cannot be naively generalised to higher VC dimension, it suggests that optimal algorithm-dependent bounds may improve substantially on current upper bounds.
On Revenue Maximization with Sharp Multi-Unit Demands
We consider markets consisting of a set of indivisible items, and buyers that
have sharp multi-unit demand. This means that each buyer $i$ wants a
specific number $d_i$ of items; a bundle of size less than $d_i$ has no value,
while a bundle of size greater than $d_i$ is worth no more than the most valued
$d_i$ items (valuations being additive). We consider the objective of setting
prices and allocations in order to maximize the total revenue of the market
maker. The pricing problem with sharp multi-unit demand buyers has a number of
properties that the unit-demand model does not possess, and is an important
question in algorithmic pricing. We consider the problem of computing a revenue
maximizing solution for two solution concepts: competitive equilibrium and
envy-free pricing.
For unrestricted valuations, these problems are NP-complete; we focus on a
realistic special case of "correlated values" where each buyer $i$ has a
valuation $v_i q_j$ for item $j$, where $v_i$ and $q_j$ are positive
quantities associated with buyer $i$ and item $j$ respectively. We present a
polynomial time algorithm to solve the revenue-maximizing competitive
equilibrium problem. For envy-free pricing, if the demand of each buyer is
bounded by a constant, a revenue maximizing solution can be found efficiently;
the general demand case is shown to be NP-hard. Comment: page2
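To make the valuation model concrete, here is a minimal sketch of how a buyer with sharp multi-unit demand would value a bundle under correlated values. The symbols are assumptions chosen for this example: `v_i` is the buyer's positive value parameter, `d_i` the buyer's demand, and `qualities` the positive quality parameters of the items in the bundle. It only evaluates the definition; it is not the pricing algorithm from the paper.

```python
def bundle_value(v_i, d_i, qualities):
    """Value of a bundle for a buyer with sharp demand d_i.

    A bundle with fewer than d_i items is worth nothing; a larger bundle is
    worth only its d_i highest-quality items, with additive valuations.
    """
    if len(qualities) < d_i:
        return 0.0
    top = sorted(qualities, reverse=True)[:d_i]
    return v_i * sum(top)

# Hypothetical buyer with v_i = 2.0 and demand d_i = 2.
too_small = bundle_value(2.0, 2, [3.0])
exact = bundle_value(2.0, 2, [3.0, 1.0])
too_large = bundle_value(2.0, 2, [3.0, 1.0, 5.0])
```

With these numbers the undersized bundle is worth 0, the exact-size bundle is worth 2 × (3 + 1) = 8, and the oversized bundle is worth 2 × (5 + 3) = 16, since only the two best items count.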
Constructing computer virus phylogenies
There has been much recent algorithmic work on the problem of reconstructing the evolutionary history of biological species. Computer virus specialists are interested in finding the evolutionary history of computer viruses: a virus is often written using code fragments from one or more other viruses, which are its immediate ancestors. A phylogeny for a collection of computer viruses is a directed acyclic graph whose nodes are the viruses and whose edges map ancestors to descendants and satisfy the property that each code fragment is "invented" only once. To provide a simple explanation for the data, we consider the problem of constructing such a phylogeny with a minimum number of edges. In general this optimization problem is NP-complete; some associated approximation problems are also hard, but others are easy. When tree solutions exist, they can be constructed and randomly sampled in polynomial time.
The Hairy Ball Problem is PPAD-Complete
The Hairy Ball Theorem states that every continuous tangent vector field on an even-dimensional sphere must have a zero. We prove that the associated computational problem of computing an approximate zero is PPAD-complete. We also give a FIXP-hardness result for the general exact computation problem.
In order to show that this problem lies in PPAD, we provide new results on multiple-source variants of End-of-Line, the canonical PPAD-complete problem. In particular, finding an approximate zero of a Hairy Ball vector field on an even-dimensional sphere reduces to a 2-source End-of-Line problem. If the domain is changed to be the torus of genus g >= 2 instead (where the Hairy Ball Theorem also holds), then the problem reduces to a 2(g-1)-source End-of-Line problem.
These multiple-source End-of-Line results are of independent interest and provide new tools for showing membership in PPAD. In particular, we use them to provide the first full proof of PPAD-completeness for the Imbalance problem defined by Beame et al. in 1998.
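For intuition about what an "approximate zero" means here, the following minimal sketch takes one specific tangent vector field on the 2-sphere, the rotation field (x, y, z) -> (-y, x, 0), and searches a grid of points for one where the field has norm below a tolerance. The field, the grid search, and all function names are assumptions made for illustration. This brute-force search is unrelated to the End-of-Line reductions in the paper and says nothing about the problem's complexity.

```python
import math

def field(p):
    """Rotation about the z-axis: tangent to the unit sphere at every point."""
    x, y, z = p
    return (-y, x, 0.0)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def sphere_point(theta, phi):
    """Point on the unit sphere from polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def find_approximate_zero(eps, steps=200):
    """Return the first grid point where the field has norm below eps, if any."""
    for i in range(steps + 1):
        theta = math.pi * i / steps
        for j in range(steps):
            phi = 2.0 * math.pi * j / steps
            p = sphere_point(theta, phi)
            if norm(field(p)) < eps:
                return p
    return None

zero = find_approximate_zero(1e-3)
```

For this field the exact zeros are the two poles, so the search stops immediately at the north pole; a general field would need a far finer search, which is where the computational question studied in the paper arises.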